ApproXQL: Design and Implementation of an Approximate Pattern Matching Language for XML
Abstract
We introduce the simple query language approXQL, which supports hierarchical, Boolean-connected query patterns. The interpretation of approXQL queries is founded on cost-based query transformations: the total cost of a sequence of transformations measures the similarity between a query and the data and is used to rank the results. We describe in detail the implementation of the approXQL query processor, which uses an expanded query representation and sophisticated indexes to compute all results of a query in polynomial – typically sublinear – time with respect to the database size.
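The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the approXQL query processor: the `Transformation` type, the transformation kinds, the cost values, and the `rank` function are hypothetical names chosen for illustration, and the sketch assumes each candidate result already comes with the sequence of transformations that made the query match it. The only point shown is that the summed cost of a sequence acts as a distance, so results are ordered by ascending total cost.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Transformation:
    kind: str  # hypothetical label, e.g. "rename", "insert", "delete"
    cost: int  # hypothetical non-negative cost of applying this step


def total_cost(sequence: List[Transformation]) -> int:
    """Sum the costs of all transformations in one sequence."""
    return sum(step.cost for step in sequence)


def rank(candidates: Dict[str, List[Transformation]]) -> List[Tuple[int, str]]:
    """Return (total cost, result id) pairs, cheapest (most similar) first.

    An empty sequence means the query matched without any transformation,
    so that result has cost 0 and is ranked first.
    """
    return sorted(
        (total_cost(seq), result_id) for result_id, seq in candidates.items()
    )
```

Under these assumptions, an exact match always precedes any approximate match, and ties in cost are broken by result id only because of how tuples sort in Python.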
Similar resources
Schema-Driven Evaluation of ApproXQL Queries
Query engines for heterogeneous collections of XML data should retrieve exact results – but also answers that are similar to the query. In this paper, we present a simple pattern matching language, which supports hierarchical, Boolean-connected query patterns. The interpretation of a query is founded on cost-based query transformations: The total cost of a sequence of transformations measures t...
Similarity Search in XML Data using Cost-Based Query Transformations
XML query engines should support structured queries. They should retrieve exact matches as well as results similar to the query. In this paper, we introduce the simple query language approXQL that supports hierarchical, Boolean-connected query patterns. The interpretation of approXQL queries is founded on cost-based query transformations: The total cost of a sequence of transformations measures...
Phil: A Lazy Implementation of a Language for Approximate Filtering of XML Documents
In this paper, we introduce a system, written in Haskell, for filtering information from XML data. Essentially, the system implements a simple declarative language which allows one to extract relevant data as well as to exclude useless and misleading contents from an XML document by matching patterns against XML documents. The matching mechanism employs a cost-based pattern transformation algo...
Run, Xtatic, Run: Efficient Implementation of an Object-Oriented Language with Regular Pattern Matching
Michael Y. Levin, Benjamin C. Pierce. Schema languages such as DTD, XML Schema, and Relax NG have been steadily growing in importance in the XML community. A schema language provides a mechanism for defining the type of XML documents; i.e., the set of constraints that specify the structure of X...
Adaptive Approximate Record Matching
Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records that belong to the same entity. The aim of Approximate Record Matching is to find multiple records that belong to one entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...
Publication date: 2001